Improving Contrast Set Mining

Authors

  • Mondelle Simeon
  • Robert Hilderman
Abstract

A fundamental task in exploratory data analysis is discerning the differences between contrasting groups. Contrast set mining was developed as a data mining task that aims to identify these differences. This paper examines the algorithms, heuristics, and open issues of contrast set mining, and seeks to improve it by addressing several of those open issues. It proposes four interestingness measures for ranking contrast sets: coverage, overall support, growth rate, and unusualness. It introduces a new method for discretizing quantitative attributes. It defines a new type of contrast set, the jumping contrast set, and modifies the contrast set mining process to mine both types of contrast sets on datasets containing both quantitative and categorical attributes. Finally, it introduces a simple visualization method for describing contrast sets to the end user.
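As a rough illustration of how measures of this kind can rank a contrast set, the sketch below computes per-group support, coverage, growth rate, and unusualness for one conjunction of attribute-value pairs. The formulas are common textbook definitions (growth rate as a ratio of supports, unusualness as weighted relative accuracy) and are assumptions here; the paper's own definitions may differ. Overall support is left out because the abstract does not define it. The data, the `group` and `occupation` attributes, and the function names are all hypothetical.

```python
import math
from typing import Dict, List

Row = Dict[str, str]

# Hypothetical toy data: a group label plus one categorical attribute per row.
ROWS: List[Row] = [
    {"group": "A", "occupation": "sales"},
    {"group": "A", "occupation": "sales"},
    {"group": "A", "occupation": "sales"},
    {"group": "A", "occupation": "tech"},
    {"group": "B", "occupation": "sales"},
    {"group": "B", "occupation": "tech"},
    {"group": "B", "occupation": "tech"},
    {"group": "B", "occupation": "tech"},
]


def matches(row: Row, contrast_set: Dict[str, str]) -> bool:
    """True if the row satisfies every attribute-value pair in the set."""
    return all(row.get(attr) == value for attr, value in contrast_set.items())


def contrast_measures(
    rows: List[Row], group_attr: str, group: str, contrast_set: Dict[str, str]
) -> Dict[str, float]:
    """Assumed definitions, comparing `group` against all other rows."""
    in_group = [r for r in rows if r[group_attr] == group]
    out_group = [r for r in rows if r[group_attr] != group]
    if not in_group or not out_group:
        raise ValueError("both the target group and its complement must be non-empty")

    hits_in = sum(matches(r, contrast_set) for r in in_group)
    hits_out = sum(matches(r, contrast_set) for r in out_group)
    hits = hits_in + hits_out

    support_in = hits_in / len(in_group)     # fraction of the group that matches
    support_out = hits_out / len(out_group)  # fraction of the other rows that match
    coverage = hits / len(rows)              # fraction of all rows that match

    # Ratio of supports; infinite when the set never occurs outside the group.
    if support_out > 0:
        growth_rate = support_in / support_out
    else:
        growth_rate = math.inf if support_in > 0 else 0.0

    # Weighted relative accuracy: coverage * (P(group | set) - P(group)).
    p_group = len(in_group) / len(rows)
    p_group_given_set = hits_in / hits if hits else 0.0
    unusualness = coverage * (p_group_given_set - p_group)

    return {
        "support_in": support_in,
        "support_out": support_out,
        "coverage": coverage,
        "growth_rate": growth_rate,
        "unusualness": unusualness,
    }


if __name__ == "__main__":
    print(contrast_measures(ROWS, "group", "A", {"occupation": "sales"}))
```

On the toy data, the set `occupation = sales` matches three of four rows in group A and one of four in group B, so the sketch reports supports of 0.75 and 0.25, coverage 0.5, growth rate 3.0, and unusualness 0.125.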


Similar resources

Supporting Factors to Improve the Explanatory Potential of Contrast Set Mining: Analyzing Brain Ischaemia Data

The goal of exploratory pattern mining is to find patterns that exhibit as yet unknown relationships in data and to provide insightful representations of the detected relationships. This paper explores contrast set mining and an approach to improving its explanatory potential by using so-called supporting factors, which provide additional descriptions of the detected patterns. The proposed methodolo...


Mining Interesting Contrast Sets

Contrast set mining has been developed as a data mining task which aims at discerning differences across groups. These groups can be patients, organizations, molecules, and even time-lines. A valid contrast set is a conjunction of attribute-value pairs that differ significantly in their distribution across groups. The search for valid contrast sets can produce a prohibitively large number of re...


Contrast Set Mining Through Subgroup Discovery Applied to Brain Ischaemia Data

Contrast set mining aims at finding differences between different groups. This paper shows that a contrast set mining task can be transformed to a subgroup discovery task whose goal is to find descriptions of groups of individuals with unusual distributional characteristics with respect to the given property of interest. The proposed approach to contrast set mining through subgroup discovery wa...


A novel feature selection techniques based on contrast set mining

Data classification is a challenging task in the era of big data because of the high number of features. Feature selection is a step in the process of knowledge discovery in data that aims to reduce dimensionality and improve classification performance. The purpose of this research is to define new techniques for feature selection in order to improve classification accuracy and reduce the time required for...


Mining Discrimination Patterns along Temporal Databases

In certain data analysis tasks, understanding the underlying differences between groups or classes is of the utmost importance. Contrast set mining relies on discovering significant patterns by contrasting two or more groups. A contrast set is a conjunction of attribute-value pairs that differs meaningfully in its distribution across groups. One technique proposed is Rules for Contrast Sets (RCS...




Publication date: 2008